-
-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: type annotations docs #96
Conversation
#+END_SRC | ||
|
||
The =declare= form starts with =elsa= followed by one or two values. If | ||
two are provided, first must be a list which is the list of input |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want to drop the parens around arguments and just have a sequence of types where the last one is automatically return type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, that would be awkward in case you want a void function or rather, ignore the return value. Requiring parentheses around arguments would model this naturally (nothing would follow them) whereas without them you'd always need to specify the last thing. Also, what if you have an empty argument list, but a meaningful return value (like, emacs-version
)? How would you tell both apart without parentheses around the arguments? If you need a better separator of arguments and return value, I like :
in the Wirth family, an arrow would be another option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can always inspect the argument list, so incase of (point)
you know the int
is the return value since there is no argument.
I have never considered void functions, such a thing does not really exist in Elisp does it? At the very least you always return nil
or some such.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course there are void functions, everything you call for its side effect, like save-buffer
(returns nil
) or revert-buffer
(returns t
). What I'm pointing out is that we could avoid ambiguity with sexp-syntax.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A void function returns nothing. Every function in elisp returns something.
Doing (string-to-number (save-buffer))
would error. We don't like errors, therefore we must always consider the return type... the above code is silly but might be a result of a failed refactoring or a mistake and we want to emit a warning there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing
(string-to-number (save-buffer))
would error.
I agree and my conclusion is that we should consider the void
return type. The result of a function returning void
should not be passed as argument to another function nor stored into a variable. When calling a void
function is the last statement of a function, then that function itself returns void
.
I agree with @Fuco1 that in ELisp, every function returns something. But not all functions are advertised as returning something: those functions should have the void
return type.
Consider save-buffer
for example and consider that no void
return type exists. A static analysis could detect that this function returns a bool
for example. Then, Elsa should accept code like this (if (save-buffer) ...)
. My opinion is that this is wrong because the implementation of save-buffer
can change tomorrow to return an int
instead. This will be an implementation detail until save-buffer
docstring describes what the return value is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Void is nasty because it doesn't play well with the type algebra.
What would this function return?
(defun what-is-return-type ()
(if (we-should-save-buffer) ;; bool
(save-buffer) ;; void
(return-time-since-last-save-in-seconds) ;; int
))
A type like void | int
makes no sense because void basically poisons everything. If you would want to use such a function you would have to test the return type anyway (did I get an int back?). So then the type of save-buffer
doesn't really matter.
It's a bit silly example, but the problem is that void is not a type in a sense every other type is a type (i.e. a set). Void is not even an empty set type, it's something outside the hierarchy.
This function
(defun forever-and-ever ()
(while t t))
also returns void? Or is that something else.
I have this issue #68 to check for unused return values so I would rather go with a "this function should be only used for side-effect" warning but not on the type level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are much more knowledgeable than me in this area. Do what you think is best!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A type like void | int makes no sense because void basically poisons everything.
Some languages (Rust, OCaml) have unit types for this purpose ()
. In this example, (or void int)
tells you that you may get a useful value (int
) or you may get a useless one (void
). You have to test the return type in order to use the int
, because you cannot do anything with void
. But that's the point of the type analysis: to make sure your code is correct in all situations.
If you would want to use such a function you would have to test the return type anyway (did I get an int back?). So then the type of save-buffer doesn't really matter.
But it does matter. Consider the alternative: (or bool int)
, in which case you may freely use the bool
return value even though it's undocumented, and may change in the future as argued by @DamienCassou. If it changes to int
, then if you previously made use of the bool
value your code won't typecheck anymore, but there is value in an annotation to indicate "this value is not relevant and subject to change".
This function
(defun forever-and-ever ()
(while t t))
also returns void? Or is that something else.
Something else: it's the empty type (or bottom type). It indicates that this function never returns, which is different than void
as argued above, which means returns something you cannot use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To complete: the empty type may be more tricky to deal with, but in practice may not be that useful. The void
unit type however, though by not mean indispensable. is useful.
This looks beautiful, Matus! Using the |
docs/type-annotations.org
Outdated
#+BEGIN_SRC elisp | ||
(defun drop-items (list &optional n) | ||
"Drop first item of LIST or N items if N is provided." | ||
(declare (elsa ((list a*) &optional int) (list a*))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The &optional
marker is not actually necessary since we can get that info from the function signature itself. I feel it might be a good idea to have it redundant for cross-checking, but it might prove more annoying than useful.
Same for &rest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With &rest
there's also the issue that we might want to wrap the type with (list)
or assume that implicitly.
(elsa (&rest string) string
implicit list(elsa (&rest (list string)) string
(elsa (string) string)
if we decide to get rid of&rest
+ implicit list(elsa ((list string)) string)
if we decide to get rid of&rest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also &optional
implies (or nil X)
... I think it will be best to leave the markers in as it makes the whole thing read a lot easier.
Ideas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just* make &optional
and &rest
optional? The extent to which they help with readability rather depends on the argument list at hand. For the simplest functions, they don't add much: something like
(defun make-csv-row (&rest strings)
(declare (elsa (string) string))
(string-join strings ","))
reads perfectly naturally without a second &rest
, but the redundant annotations become useful very quickly as the argument list's complexity grows. I would think providing them should be considered good style, in general.
* "just" in the sense that "both!" is a simple answer to either/or questions; I don't have a clear sense of what the implementation difficulty would actually be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ambirdsall I kind of like this. I'm a bit afraid people will try to derive some extra significance out of it though. i.e. lack of &rest
meaning something as opposed to being optional. Documentation helps of course but it's still unobvious at first.
But I think we can make it optional and see the reactions. We can always deprecate that usage and ask users to fill in missing types (e.g. Elsa itself can check this and provide an error)
Sorry, could you clarify the question? Also, I can't seem to reply in that conversation... |
@alphapapa Not really a question just observation on how it looks. I think it could be implicitly understood by the presence of |
Elegant lispy API which fits nicely with the rest of Emacs. It seems super readable to me :) I like having the explicit I've never heard of intersection types before, do other languages have them? |
@Wilfred The intersection type is something used internally in the analysis. You would likely never write an annotation like that. In dynamic languages expressions can be "anything" and so we need to have a way (i.e. "set theory") to compute the possible run-time domains. You can read some more about it in the Flow (javascript static analysis) docs: https://flow.org/en/docs/types/intersections/ |
I still have no good idea on how to represent things like records ( That might actually work fine. I don't think we can do with just using the record/class names. It should also support generic classes, so something like The problem is that So we would mix a constructor semantic with some sort of "annotation" semantic. |
413d0ca
to
710cb3f
Compare
By default types do not accept =nil= values. This helps preventing | ||
errors where you would pass =nil= to a function expecting a value. To | ||
mark a type nullable you can combine it with =nil= type using the [[id:5a21a68a-4df1-4d44-a854-1d9700858a1a][or]] | ||
combinator: =(or string nil)= is a type which takes any string but also |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the rationale for dropping the ?
suffix? I liked the fact that it was both expressive and short.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea I like it and I would not oppose having it remain. It is a lot nicer to read int?
instead of (or int nil)
.
I don't want to overdo it with the suffixes though. There will also be the generic a*
syntax (or some other variant) and then it might matter on how we parse the suffixes (i.e. is a*?
the same as a?*
or does it even make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you can combine *
and ?
in the same type because what is before *
is an abstract name that conveys no meaning by itself. My opinion is that a*?
and a?*
should trigger an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you know Haskell I can think of this as an analogy of Maybe a
, that would be in our syntax a*?
(since a*
is the generic type constructor). So basically ?
is short for (or nil <arg>)
, similar as one can write (list a*)
a?*
probably makes no sense other than a?
would be the placeholder and that shouldn't be allowed indeed.
We can also put some things to the front, (e)lisp has a tradition with the &
prefix.
Or we can go entirely different way and for example use strings or quoted symbols for generic names, so that it would be "foo"
or (list "foo")
instead of foo*
.
|
||
** Most general type | ||
|
||
A type for everything is called =mixed=. It accepts anything and is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that "accepts anything" is a bit wrong here: apparently, mixed
does not accept [mixed]
(see #112).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should so that's a bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue in #112 is the other way around though. We want an array [Mixed]
and we are getting Mixed
instead which might not be an array. So this is a mismatch.
- =(cons a b)= where =a= is the type of =car= and =b= is the type of =cdr=. If | ||
the =car= and =cdr= can be anything write =(cons mixed mixed)= or simply | ||
=cons= for short. | ||
- =(list a)= where =a= is the type of items in the list. If the list can |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this syntax, you can't distinguish between:
- a list with unknown length and whose elements are all of the same type, and
- a list with length 1
I suggest (list a)
to represent case 2 and (list a...)
to represent case 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. I never actually had in mind to have lists of finite/known length with the type constructor list
. I was thinking of using tuple
as short for (cons a (cons b (cons c nil)))
which is a list of three things of types a
, b
, and c
.
Soy idea was (tuple a b c)
and list
would be specifically for a list (of unknown length) with a repeated value.
I think I did not write it down anywhere, so I should. Does this make sense to you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using tuple
probably makes more sense than to reuse list
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had another thought at tuple
and I'm not sure about it anymore. The tuple
solution won't let you express that the first items are of some known type where the rest is unknown. Some use cases:
- the parameters to
format
((list string mixed...)
) - tagged lists where the first element is a symbol indicating how to interpret what follows (
(list symbol mixed...)
) - poor-man structures (e.g., a person could be described as a name, a birthdate and children with
(string date person...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's also a bit related to structures like plists or alists, where alists are lists of lists where the car
is something. These are good points.
In the future I also want to support "constant" types, so you can explicitly require specific values, for example, using your notation
(list (or 'regex 'syntax) string...)
could mean "this is a list of strings which should be interpreted as regular expressions if the car
is a constant 'regex
, or as syntax descriptors if 'syntax
.
The constants will probably have a constructor const
similar to how defcustom
does it. Maybe we can use that syntax? The defcustom
type (repeat string)
is a list of strings ("a" "b" "c" "d" ...)
where (list string integer symbol)
is a list of three things ("foo" 1 'bar)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We wanted to reuse the defcustom stuff so I dived into it and I realized that I really like the constructor-as-type principle.
There's also this cool pattern-matching library which uses a similar concept: https://github.com/VincentToups/shadchen-el
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So actually maybe we can lift the list-rest
idea from there. Or allow a &rest
keyword inside list
such that (list &rest int)
is a list of integers of any length and (list string bool &rest int)
requires a string, a bool and then a bunch of ints.
The same can work in parallel with vector
such that (vector string bool &rest int)
is... you get the idea.
This might be it! :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about this
Lists are made by chaining cons cells together and leaving =nil= in the
last =cdr= of the last =cons=. Therefore =(cons a (cons b (cons c nil)))=
can be understood as a list of exactly three items of type =a=, =b=, and
=c=.
Elisp provides a data constructor =list= to make creating such lists
easier. In analogy to =cons= we provide a =list= type which corresponds
to this, such that =(list "foo" 1 :bar)= has type =(list string int
keyword)=.
In general =(list a b c ...)= is a list type of known length where =a=, =b=,
=c=... are types of the items on the 0th, 1st, 2nd... places. It's a
proper list so the last =cons='s =cdr= is a =nil=.
Sometimes we need to express lists of unknown length which contain
items of the same thing repeatedly. For example =(3 2 5 1 5 6 7 5)= is
a list of numbers. Because we don't know its length beforehand we
provide a special syntax to express that a type repeats forever inside
the list.
The type =(list a b c . rest-type)= is a type similar to =list= except the
last =rest-type= repeats forever. Therefore, we expect a known number
of items of types =a=, =b=, =c= and then any number of items of =rest-type=.
If we only speficy one type it is automatically the rest type, so that
=(list . rest-type)= holds any number of items of =rest-type=.
If the list can hold any number of anything, write =(list . mixed)= or
simply =list= for short.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like it very much, good job. I'm not so sure about this phrasing though:
If we only speficy one type it is automatically the rest type, so that
(list . rest-type)
holds any number of items ofrest-type
.
If the reader doesn't pay attention, she might understand that you are talking about (list rest-type)
and not (list . rest-type)
. May I suggest this phrasing instead?
If we specify no type before the dot
.
as in(list . rest-type)
, this means the list holds any number of items ofrest-type
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point, I wasn't so sure myself but couldn't phrase it better.
I'm also thinking about maybe introducing some other shorter notation without the dot as that is confusing and we maybe might want to use it for dotted lists (albeit those are really rare so I wouldn't care about those very much).
Something like (list* a)
where the star denotes a repetition like in formal grammars and stuff. I'd probably move the generic *
marker before the type then.
docs/type-annotations.org
Outdated
A sum type is useful if the function internally checks the passed | ||
value and decides what processing to do: | ||
|
||
*TODO:* I think the double parens in the "argument" portion will lead to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is the same problem as the double-parenthesis in the bindings of the let
form. I don't find it confusing but I don't know about others.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More than confusing it's also a bit noisy I think.
(elsa ((function (a*) b*) (repeat a*)) (repeat b*))
for me it's very hard to get a good overview of where the arguments are and where the return type is.
(elsa (function (a*) b*) (repeat a*) -> (repeat b*))
is to me a lot more visual.
In my original thoughts I actually had two annotations (elsa-args)
and (elsa-return)
. Not sure if that's better though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me the idiomatic lisp thing would be to break the s-expression onto multiple lines and use indentation to visually indicate structure. The ->
notation is nice and efficient for one-line type annotations, but in the real world it seems likely that people would get that visual reinforcement of the type structure by writing
(elsa ((function (a*) b*)
(repeat a*))
(repeat b*))
instead of
(elsa ((function (a*) b*) (repeat a*)) (repeat b*))
at least within elisp code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really nice idea and should be easily implemented using the indent engine. When you lay out out like this the structure actually makes sense too!
I will update the examples to use this layout and hopefully it catches on.
710cb3f
to
693f23c
Compare
arguments. | ||
|
||
Composite types usually correspond to data constructors such as =cons=, | ||
=list=, =vector=... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how do you specify a plist where odd items are symbols and even items are strings? E.g.,:
`(:background "#583333" :foreground "#29457")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For plists I'm planning to have a special type (plist key value)
, same for alist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually it will be a bit more involved since you might also want to express what keys exactly are allowed. For that I'm pondering a set
type, such that
(set :foo :bar :baz)
Allows any of :foo
, :bar
, :baz
zero or once and nothing else.
Then
(plist (set :foo :bar :baz) string)
Is a type of a plist with string values and keys :foo
, :bar
, :baz
.
Now this still does not allow us to specify what value type we want for what key type exactly. In javascript (flow/typescript) they do it simply by enumerating:
{
name: string,
age: int
}
so maybe we can also have such a syntax:
(plist :foo string :bar int)
I'm not decided yet here what is the best way. Ideally somehow combine it all for maximal flexibility.
Some more points of things currently impossible to express: #117 (comment) |
fdbe3d4
to
00fdb5f
Compare
I've implemented the "lispification" of the type annotation language. I want to merge this to avoid the code diverging as I work on more features, but I don't want to lose the unresolved discussion here. Anyone has a good idea what to do? Create a new pull request and merge that leaving this open? |
See also #74.